Method for encoding and decoding audio at a variable rate

ABSTRACT

A maximum of Nmax bits for encoding is defined for a set of parameters which may be calculated from a signal frame. The parameters for a first sub-set are calculated and encoded with N0 bits, where N0<Nmax. The allocation of Nmax−N0 encoding bits for the parameters of a second sub-set is determined and the encoding bits allocated to the parameters for the second sub-set are classified. The allocation and/or order of classification of the encoding bits are determined as a function of the encoding parameters for the first sub-set. For a total of N available bits for the encoding of the total parameters (N0<N≦Nmax), the parameters for the second sub-set allocated the N−N0 encoding bits classified the first in said order are selected. Said selected parameters are calculated and encoded to give the N−N0 bits. The N0 encoding bits for the first sub-set and the N−N0 encoding bits for the selected parameters for the second sub-set are finally introduced into the output sequence of the encoder.

The invention relates to devices for coding and decoding audio signals, intended in particular to sit within applications of transmission or storage of digitized and compressed audio signals (speech and/or sounds).

More particularly, this invention pertains to audio coding systems having the capacity to provide varied bit rates, also referred to as multirate coding systems. Such systems are distinguished from fixed rate coders by their capacity to modify the bit rate of the coding, possibly during processing, this being especially suited to transmission over heterogeneous access networks: be they networks of IP type mixing fixed and mobile access, high bit rates (ADSL), low bit rates (RTC, GPRS modems) or involving terminals with variable capacities (mobiles, PCs, etc.).

Essentially, two categories of multirate coders are distinguished: that of “switchable” multirate coders and that of “hierarchical” coders.

“Switchable” multirate coders rely on a coding architecture belonging to a technological family (temporal coding or frequency coding, for example: CELP, sinusoidal, or by transform), in which an indication of bit rate is simultaneously supplied to the coder and to the decoder. The coder uses this information to select the parts of the algorithm and the tables relevant to the bit rate chosen. The decoder operates in a symmetric manner. Numerous switchable multirate coding structures have been proposed for audio coding. Such is the case for example with mobile coders standardized by the 3GPP organization (“3rd Generation Partnership Project”), NB-AMR (“Narrow Band Adaptive Multirate”, Technical Specification 3GPP TS 26.090, version 5.0.0, June 2002) in the telephone band, or WB-AMR (“Wide Band Adaptive Multirate”, Technical Specification 3GPP TS 26.190, version 5.1.0, December 2001) in wideband. These coders operate over fairly wide bit rate ranges (4.75 to 12.2 kbit/s for NB-AMR, and 6.60 to 23.85 kbit/s for WB-AMR), with a fairly sizeable granularity (8 bit rates for NB-AMR and 9 for WB-AMR). However, the price to be paid for this flexibility is a rather considerable complexity of structure: to be able to host all these bit rates, these coders must support numerous different options, varied quantization tables etc. The performance curve increases progressively with bit rate, but the progress is not linear and certain bit rates are in essence better optimized than others.

In so-called “hierarchical” coding systems, also referred to as “scalable”, the binary data arising from the coding operation are distributed into successive layers. A base layer, also called the “kernel”, is formed of the binary elements that are absolutely necessary for the decoding of the binary train, and determine a minimum quality of decoding.

The subsequent layers make it possible to progressively improve the quality of the signal arising from the decoding operation, each new layer bringing new information which, utilized by the decoder, supplies a signal of increasing quality at output.

One of the particular features of hierarchical coding is the possibility offered of intervening at any level whatsoever of the transmission or storage chain so as to delete a part of the binary train without having to supply any particular indication to the coder or to the decoder. The decoder uses the binary information that it receives and produces a signal of corresponding quality.

The field of hierarchical coding structures has given rise likewise to much work. Certain hierarchical coding structures operate on the basis of one type of coder alone, designed to deliver hierarchized coded information. When the additional layers improve the quality of the output signal without modifying the bandwidth, one speaks rather of “embedded coders” (see for example R. D. Lacovo et al., “Embedded CELP Coding for Variable Bit-Rate Between 6.4 and 9.6 kbit/s”, Proc. ICASSP 1991, pp. 681-686). Coders of this type do not however allow large gaps between the lowest and the highest bit rate proposed.

The hierarchy is often used to progressively increase the bandwidth of the signal: the kernel supplies a baseband signal, for example telephonic (300-3400 Hz), and the subsequent layers allow the coding of additional frequency bands (for example, wide band up to 7 kHz, HiFi band up to 20 kHz or intermediate, etc.). The subband coders or coders using a time/frequency transformation such as described in the documents “Subband/transform coding using filter banks designs based on time domain aliasing cancellation” by J. P. Princen et al. (Proc. IEEE ICASSP-87, pp. 2161-2164) and “High Quality Audio Transform Coding at 64 kbit/s”, by Y. Mahieux et al. (IEEE Trans. Commun., Vol. 42, No. 11, November 1994, pp. 3010-3019), lend themselves particularly to such operations.

Moreover, a different coding technique is frequently used for the kernel and for the module or modules coding the additional layers; one then speaks of various coding stages, each stage consisting of a subcoder. The subcoder of the stage of a given level will be able either to code parts of the signal that are not coded by the previous stages, or to code the coding residual of the previous stage; the residual is obtained by subtracting the decoded signal from the original signal.

The advantage of such structures is that they make it possible to go down to relatively low bit rates with sufficient quality, while producing good quality at high bit rate. Specifically, the techniques used for low bit rates are not generally effective at high bit rates and vice versa.

Such structures making it possible to use two different technologies (for example CELP and time/frequency transform, etc.) are especially effective for sweeping large bit rate ranges.

However, the hierarchical coding structures proposed in the prior art define precisely the bit rate allocated to each of the intermediate layers. Each layer corresponds to the encoding of certain parameters, and the granularity of the hierarchical binary train depends on the bit rate allocated to these parameters (typically a layer can contain of the order of a few tens of bits per frame, a signal frame consisting of a certain number of samples of the signal over a given duration, the example described later considering a frame of 960 samples corresponding to 60 ms of signal).

Moreover, when the bandwidth of the decoded signals can vary according to the level of the layers of binary elements, the modification of the line bit rate may produce artifacts that impede listening.

The present invention has the aim in particular of proposing a multirate coding solution which alleviates the drawbacks cited in the case of the use of existing hierarchical and switchable codings.

The invention thus proposes a method of coding a digital audio signal frame as a binary output sequence, in which a maximum number Nmax of coding bits is defined for a set of parameters that can be calculated according to the signal frame, which set is composed of a first and of a second subset. The proposed method comprises the following steps:

-   calculating the parameters of the first subset, and coding these parameters on a number N0 of coding bits such that N0<Nmax;
-   determining an allocation of Nmax−N0 coding bits for the parameters of the second subset; and
-   ranking the Nmax−N0 coding bits allocated to the parameters of the second subset in a determined order.

The allocation and/or the order of ranking of the Nmax−N0 coding bits are determined as a function of the coded parameters of the first subset. The coding method furthermore comprises the following steps in response to the indication of a number N of bits of the binary output sequence that are available for the coding of said set of parameters, with N0<N≦Nmax:

-   selecting the second subset's parameters to which are allocated the N−N0 coding bits ranked first in said order;
-   calculating the selected parameters of the second subset, and coding these parameters so as to produce said N−N0 coding bits ranked first; and
-   inserting into the output sequence the N0 coding bits of the first subset as well as the N−N0 coding bits of the selected parameters of the second subset.
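
By way of illustration only, the selection step may be sketched as follows in Python. The function name `select_parameters`, the representation of the ranked allocation as (identifier, bit count) pairs, and the choice of stopping at the first parameter that no longer fits entirely are assumptions made for this sketch and are not recited in the method itself.

```python
def select_parameters(ranked_alloc, n0, n):
    """Pick the second-subset parameters covered by the first n - n0 ranked bits.

    ranked_alloc: list of (param_id, bits) pairs, already in ranking order
                  (hypothetical representation of the allocation).
    Returns the selected identifiers and the number of bits they occupy.
    """
    if n < n0:
        raise ValueError("n must be at least n0")
    budget = n - n0
    selected, used = [], 0
    for param_id, bits in ranked_alloc:
        if used + bits > budget:
            break  # assumption: a parameter that does not fit entirely is dropped
        selected.append(param_id)
        used += bits
    return selected, used


# Made-up numbers: N0 = 384 and N = 402 leave 18 bits for the second subset.
ranked = [("band7", 10), ("band2", 6), ("band5", 8)]
print(select_parameters(ranked, 384, 402))
```

With these made-up values the first two parameters are kept and occupy 16 of the 18 available bits.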

The method according to the invention makes it possible to define a multirate coding, which will operate at least in a range corresponding for each frame to a number of bits ranging from N0 to Nmax.

It may thus be considered that the notion of pre-established bit rates which is related to the existing hierarchical and switchable codings is replaced by a notion of “cursor”, making it possible to freely vary the bit rate between a minimum value (that may possibly correspond to a number of bits N less than N0) and a maximum value (corresponding to Nmax). These extreme values are potentially far apart. The method offers good performance in terms of effectiveness of coding regardless of the bit rate chosen.

Advantageously, the number N of bits of the binary output sequence is strictly less than Nmax. What is noteworthy about the coder is then that the allocation of the bits that is employed makes no reference to the actual output bit rate of the coder, but to another number Nmax agreed with the decoder.

It is however possible to fix Nmax=N as a function of the instantaneous bit rate available on a transmission channel. The output sequence of a switchable multirate coder such as this may be processed by a decoder which does not receive the entire sequence, so long as it is capable of retrieving the structure of the coding bits of the second subset by virtue of the knowledge of Nmax.

Another case where it is possible to have N=Nmax is that of the storage of audio data at the maximum coding rate. When reading N′ bits of this content stored at lower bit rate, the decoder would be capable of retrieving the structure of the coding bits of the second subset as long as N′≧N0.

The order of ranking of the coding bits allocated to the parameters of the second subset may be a preestablished order.

In a preferred embodiment, the order of ranking of the coding bits allocated to the parameters of the second subset is variable. It may in particular be an order of decreasing importance determined as a function of at least the coded parameters of the first subset. Thus the decoder which receives a binary sequence of N′ bits for the frame, with N0≦N′≦N≦Nmax, will be able to deduce this order from the N0 bits received for the coding of the first subset.

The allocation of the Nmax−N0 bits to the coding of the parameters of the second subset may be carried out in a fixed manner (in this case, the order of ranking of these bits will be dependent at least on the coded parameters of the first subset).

In a preferred embodiment, the allocation of the Nmax−N0 bits to the coding of the parameters of the second subset is a function of the coded parameters of the first subset.

Advantageously, this order of ranking of the coding bits allocated to the parameters of the second subset is determined with the aid of at least one psychoacoustic criterion as a function of the coded parameters of the first subset.

The parameters of the second subset pertain to spectral bands of the signal. In this case, the method advantageously comprises a step of estimating a spectral envelope of the coded signal on the basis of the coded parameters of the first subset, and a step of calculating a curve of frequency masking by applying an auditory perception model to the estimated spectral envelope, and the psychoacoustic criterion makes reference to the level of the estimated spectral envelope with respect to the masking curve in each spectral band.

In a mode of implementation, the coding bits are ordered in the output sequence in such a way that the N0 coding bits of the first subset precede the N−N0 coding bits of the selected parameters of the second subset and that the respective coding bits of the selected parameters of the second subset appear therein in the order determined for said coding bits. This makes it possible, in the case where the binary sequence is truncated, to receive the most important part.

The number N may vary from one frame to another, in particular as a function for example of the available capacity of the transmission resource.

The multirate audio coding according to the present invention may be used according to a very flexible hierarchical or switchable mode, since any number of bits to be transmitted chosen freely between N0 and Nmax may be selected at any moment, that is to say frame by frame.

The coding of the parameters of the first subset may be at variable bit rate, thereby varying the number N0 from one frame to another. This allows best adjustment of the distribution of the bits as a function of the frames to be coded.

In a mode of implementation, the first subset comprises parameters calculated by a coder kernel. Advantageously, the coder kernel has a lower frequency band of operation than the bandwidth of the signal to be coded, and the first subset furthermore comprises energy levels of the audio signal that are associated with frequency bands higher than the operating band of the coder kernel. This type of structure is that of a hierarchical coder with two levels, which delivers for example via the coder kernel a coded signal of a quality deemed to be sufficient and which, as a function of the bit rate available, supplements the coding performed by the coder kernel with additional information arising from the method of coding according to the invention.

Preferably, the coding bits of the first subset are then ordered in the output sequence in such a way that the coding bits of the parameters calculated by the coder kernel are immediately followed by the coding bits of the energy levels associated with the higher frequency bands. This ensures one and the same bandwidth for the successively coded frames as long as the decoder receives enough bits to be in possession of information of the coder kernel and coded energy levels associated with the higher frequency bands.

In a mode of implementation, a signal of difference between the signal to be coded and a synthesis signal derived from the coded parameters produced by the coder kernel is estimated, and the first subset furthermore comprises energy levels of the difference signal that are associated with frequency bands included in the operating band of the coder kernel.

A second aspect of the invention pertains to a method of decoding a binary input sequence so as to synthesize a digital audio signal corresponding to the decoding of a frame coded according to the method of coding of the invention. According to this method, a maximum number Nmax of coding bits is defined for a set of parameters for describing a signal frame, which set is composed of a first and a second subset. The input sequence comprises, for a signal frame, a number N′ of coding bits for the set of parameters, with N′≦Nmax. The decoding method according to the invention comprises the following steps:

-   extracting, from said N′ bits of the input sequence, a number N0 of coding bits of the parameters of the first subset if N0<N′;
-   recovering the parameters of the first subset on the basis of said N0 coding bits extracted;
-   determining an allocation of Nmax−N0 coding bits for the parameters of the second subset; and
-   ranking the Nmax−N0 coding bits allocated to the parameters of the second subset in a determined order.

The allocation and/or the order of ranking of the Nmax−N0 coding bits are determined as a function of the recovered parameters of the first subset. The decoding method furthermore comprises the following steps:

-   selecting the second subset's parameters to which are allocated the N′−N0 coding bits ranked first in said order;
-   extracting, from said N′ bits of the input sequence, N′−N0 coding bits of the selected parameters of the second subset;
-   recovering the selected parameters of the second subset on the basis of said N′−N0 coding bits extracted; and
-   synthesizing the signal frame by using the recovered parameters of the first and second subsets.
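
The extraction step on the decoder side may be pictured with the following Python sketch, which locates the bits of each selected parameter inside the N′ received bits. The name `locate_received_params` and the (identifier, bit count) representation are hypothetical; the sketch further assumes that the second-subset bits start immediately after the N0 bits of the first subset and that a parameter cut by the truncation is treated as not received.

```python
def locate_received_params(ranked_alloc, n0, n_received):
    """Map each fully received second-subset parameter to its bit range.

    ranked_alloc: list of (param_id, bits) pairs in ranking order, recomputed
                  by the decoder from the first subset (hypothetical layout).
    Returns {param_id: (start, end)} with `end` exclusive, counted from the
    first bit of the frame.
    """
    position = n0
    ranges = {}
    for param_id, bits in ranked_alloc:
        if position + bits > n_received:
            break  # assumption: a partially received parameter is ignored
        ranges[param_id] = (position, position + bits)
        position += bits
    return ranges


# Made-up numbers: N0 = 100 and N' = 116 received bits.
ranked = [(5, 8), (1, 6), (9, 10)]
print(locate_received_params(ranked, 100, 116))
```

Because the decoder rebuilds the same ranked allocation as the coder, no side information about the truncation point is needed beyond the count of bits actually received.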

This method of decoding is advantageously associated with procedures for regenerating the parameters which are missing on account of the truncation of the sequence of Nmax bits that is produced, virtually or otherwise, by the coder.

A third aspect of the invention pertains to an audio coder, comprising means of digital signal processing that are devised to implement a method of coding according to the invention.

Another aspect of the invention pertains to an audio decoder, comprising means of digital signal processing that are devised to implement a method of decoding according to the invention.

Other features and advantages of the present invention will become apparent in the description hereinbelow of nonlimiting exemplary embodiments, with reference to the appended drawings, in which:

FIG. 1 is a schematic diagram of an exemplary audio coder according to the invention;

FIG. 2 represents a binary output sequence of N bits in an embodiment of the invention; and

FIG. 3 is a schematic diagram of an audio decoder according to the invention.

The coder represented in FIG. 1 has a hierarchical structure with two coding stages. A first coding stage 1 consists for example of a coder kernel in a telephone band (300-3400 Hz) of CELP type. This coder is in the example considered a G.723.1 coder standardized by the ITU-T (“International Telecommunication Union”) in fixed mode at 6.4 kbit/s. It calculates G.723.1 parameters in accordance with the standard and quantizes them by means of 192 coding bits P1 per frame of 30 ms.

The second coding stage 2, making it possible to increase the bandwidth towards the wide band (50-7000 Hz), operates on the coding residual E of the first stage, supplied by a subtractor 3 in the diagram of FIG. 1. A signals synchronization module 4 delays the audio signal frame S by the time taken by the processing of the coder kernel 1. Its output is addressed to the subtractor 3 which subtracts from it the synthetic signal S′ equal to the output of the decoder kernel operating on the basis of the quantized parameters such as represented by the output bits P1 of the coder kernel. As is usual, the coder 1 incorporates a local decoder supplying S′.

The audio signal to be coded S has for example a bandwidth of 7 kHz, while being sampled at 16 kHz. A frame consists for example of 960 samples, i.e. 60 ms of signal or two elementary frames of the coder kernel G.723.1. Since the latter operates on signals sampled at 8 kHz, the signal S is subsampled by a factor of 2 at the input of the coder kernel 1. Likewise, the synthetic signal S′ is oversampled at 16 kHz at the output of the coder kernel 1.

The bit rate of the first stage 1 is 6.4 kbit/s (2×N1=2×192=384 bits per frame). If the coder has a maximum bit rate of 32 kbit/s (Nmax=1920 bits per frame), the maximum bit rate of the second stage is 25.6 kbit/s (1920−384=1536 bits per frame). The second stage 2 operates for example on elementary frames, or subframes, of 20 ms (320 samples at 16 kHz).
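
The bit counts quoted above follow directly from the 60 ms frame duration, as the short Python check below shows. The helper name `bits_per_frame` is introduced for this illustration only; the numerical values are those of the example.

```python
FRAME_MS = 60      # frame duration of the example (960 samples at 16 kHz)
SUBFRAME_MS = 20   # subframe duration used by the second stage


def bits_per_frame(rate_bps, frame_ms=FRAME_MS):
    """Number of bits available in one frame at the given bit rate (bit/s)."""
    return rate_bps * frame_ms // 1000


N1 = 192                                  # bits per 30 ms G.723.1 frame
kernel_bits = 2 * N1                      # two kernel frames per 60 ms frame
n_max = bits_per_frame(32_000)            # maximum bit rate of the coder
second_stage_max = n_max - kernel_bits    # bits left for the second stage
samples_per_subframe = 16_000 * SUBFRAME_MS // 1000

print(kernel_bits, n_max, second_stage_max, samples_per_subframe)
```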

The second stage 2 comprises a time/frequency transformation module 5, for example of MDCT (“Modified Discrete Cosine Transform”) type to which the residual E obtained by the subtractor 3 is addressed. In practice, the manner of operation of the modules 3 and 5 represented in FIG. 1 may be achieved by performing the following operations for each 20 ms subframe:

-   MDCT transformation of the input signal S delayed by the module 4, which supplies 320 MDCT coefficients. The spectrum being limited to 7225 Hz, only the first 289 MDCT coefficients are different from 0;
-   MDCT transformation of the synthetic signal S′. Since one is dealing with the spectrum of a telephone band signal, only the first 139 MDCT coefficients are different from 0 (up to 3450 Hz); and
-   calculation of the spectrum of difference between the previous spectra.

The resulting spectrum is distributed into several bands of different widths by a module 6. By way of example, the bandwidth of the G.723.1 codec may be subdivided into 21 bands while the higher frequencies are distributed into 11 additional bands. In these 11 additional bands, the residual E is identical to the input signal S.

A module 7 performs the coding of the spectral envelope of the residual E. It begins by calculating the energy of the MDCT coefficients of each band of the difference spectrum. These energies are hereinbelow referred to as “scale factors”. The 32 scale factors constitute the spectral envelope of the difference signal. The module 7 then proceeds to their quantization in two parts. The first part corresponds to the telephone band (first 21 bands, from 0 to 3450 Hz), the second to the high bands (last 11 bands, from 3450 to 7225 Hz). In each part, the first scale factor is quantized on an absolute basis, and the subsequent ones on a differential basis, by using a conventional Huffman coding with variable bit rate. These 32 scale factors are quantized on a variable number N2(i) of bits P2 for each subframe of rank i (i=1, 2, 3).
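
The two-part, absolute-then-differential arrangement of the scale factors may be sketched as follows in Python. This is a simplified illustration under stated assumptions: the band edges passed to `band_energies` are hypothetical (the band limits are not specified here), the energy is taken as a plain sum of squares, the scale factors are assumed to be already quantized to integer indices, and the Huffman coding itself is omitted. All function names are introduced for the sketch.

```python
def band_energies(coeffs, edges):
    """Energy (sum of squares) of the coefficients in each band [edges[b], edges[b+1])."""
    return [sum(c * c for c in coeffs[lo:hi]) for lo, hi in zip(edges, edges[1:])]


def split_differential(indices, split):
    """Represent each part as [first index, successive differences...].

    indices: quantized scale-factor indices of one subframe (assumed integers).
    split:   number of bands in the first (telephone band) part, 21 in the example.
    """
    parts = []
    for part in (indices[:split], indices[split:]):
        parts.append(part[:1] + [b - a for a, b in zip(part, part[1:])])
    return parts


def undo_differential(parts):
    """Inverse of split_differential: rebuild the flat list of indices."""
    out = []
    for part in parts:
        acc = 0
        for k, value in enumerate(part):
            acc = value if k == 0 else acc + value
            out.append(acc)
    return out


# Toy example with 5 bands, the first 3 playing the role of the telephone band.
coded = split_differential([10, 12, 11, 4, 5], split=3)
print(coded, undo_differential(coded))
```

Restarting the differential chain at the first high band is what allows the high-band scale factors to be decoded without the low-band ones.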

The quantized scale factors are denoted FQ in FIG. 1. The quantization bits P1, P2 of the first subset consisting of the quantized parameters of the coder kernel 1 and the quantized scale factors FQ are variable in number N0=(2×N1)+N2(1)+N2(2)+N2(3). The difference Nmax−N0=1536−N2(1)−N2(2)−N2(3) is available to quantize the spectra of the bands more finely.
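
As a quick numerical illustration of this relation, the Python fragment below computes N0 and the remaining budget. The function name `first_subset_budget` and the three N2(i) values are invented for the example; only N1=192 and Nmax=1920 come from the embodiment described.

```python
def first_subset_budget(n_max, n1, n2_per_subframe):
    """Return (N0, Nmax - N0) for one frame.

    n2_per_subframe: Huffman bit counts N2(1), N2(2), N2(3), which vary per frame.
    """
    n0 = 2 * n1 + sum(n2_per_subframe)
    return n0, n_max - n0


# Hypothetical Huffman bit counts for the three subframes of one frame.
n0, remaining = first_subset_budget(1920, 192, [40, 35, 37])
print(n0, remaining)
```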

A module 8 normalizes the MDCT coefficients distributed into bands by the module 6, by dividing them by the quantized scale factors FQ respectively determined for these bands. The spectra thus normalized are supplied to the quantization module 9 which uses a vector quantization scheme of known type. The quantization bits arising from the module 9 are denoted P3 in FIG. 1.

An output multiplexer 10 gathers together the bits P1, P2 and P3 arising from the modules 1, 7 and 9 to form the binary output sequence Φ of the coder.

In accordance with the invention, the total number of bits N of the output sequence representing a current frame is not necessarily equal to Nmax. It may be less than the latter. However, the allocation of the quantization bits to the bands is performed on the basis of the number Nmax.

In the diagram of FIG. 1, this allocation is performed for each subframe by the module 12 on the basis of the number Nmax−N0, of the quantized scale factors FQ and of a spectral masking curve calculated by a module 11.

The manner of operation of the latter module 11 is as follows. It firstly determines an approximate value of the original spectral envelope of the signal S on the basis of that of the difference signal, such as quantized by the module 7, and of that which it determines with the same resolution for the synthetic signal S′ resulting from the coder kernel. These last two envelopes are also determinable by a decoder which is provided only with the parameters of the aforesaid first subset. Thus the estimated spectral envelope of the signal S will also be available to the decoder. Thereafter, the module 11 calculates a spectral masking curve by applying, in a manner known per se, a model of band by band auditory perception to the original estimated spectral envelope. This curve gives a masking level for each band considered.

The module 12 carries out a dynamic allocation of the Nmax−N0 remaining bits of the sequence Φ among the 3×32 bands of the three MDCT transformations of the difference signal. In the implementation of the invention set forth here, as a function of a criterion of psychoacoustic perceptual importance making reference to the level of the spectral envelope estimated with respect to the masking curve in each band, a bit rate proportional to this level is allocated to each band. Other ranking criteria would be useable.
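
One possible reading of this proportional allocation is sketched below in Python. Several points are assumptions of the sketch rather than details given here: the level of each band is taken as a single number (for instance the envelope-to-mask level in dB), bands at or below the mask receive no bits, and the integer bit counts are obtained by largest-remainder rounding so that they sum exactly to the budget. The name `allocate_bits` is hypothetical.

```python
def allocate_bits(levels, budget):
    """Share `budget` bits among bands in proportion to their positive levels.

    levels: per-band level of the estimated envelope relative to the mask
            (assumed representation; non-positive levels get no bits).
    """
    weights = [max(level, 0.0) for level in levels]
    total = sum(weights)
    if total == 0:
        return [0] * len(levels)
    exact = [budget * w / total for w in weights]
    alloc = [int(x) for x in exact]                 # round down first
    leftover = budget - sum(alloc)
    # Hand the remaining bits to the bands with the largest fractional parts.
    by_fraction = sorted(range(len(levels)),
                         key=lambda i: exact[i] - alloc[i], reverse=True)
    for i in by_fraction[:leftover]:
        alloc[i] += 1
    return alloc


print(allocate_bits([6.0, 3.0, -2.0, 1.0], 20))
print(allocate_bits([6.0, 3.0, -2.0, 1.0], 7))
```

Since the inputs are all derivable from the first subset, a decoder running the same function obtains the same allocation.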

Subsequent to this allocation of bits, the module 9 knows how many bits are to be considered for the quantization of each band in each subframe.

Nevertheless, if N<Nmax, these allocated bits will not necessarily all be used. An ordering of the bits representing the bands is performed by a module 13 as a function of a criterion of perceptual importance. The module 13 ranks the 3×32 bands in an order of decreasing importance which may be the decreasing order of the signal-to-mask ratios (ratio between the estimated spectral envelope and the masking curve in each band). This order is used for the construction of the binary sequence Φ in accordance with the invention.
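
The ranking by decreasing signal-to-mask ratio can be sketched in a few lines of Python. The sketch assumes that the envelope and the masking curve are expressed in dB, so that the ratio becomes a difference, and that ties are broken by band index; neither choice is specified here, and the name `rank_bands` is hypothetical.

```python
def rank_bands(envelope_db, mask_db):
    """Return band indices ordered by decreasing signal-to-mask ratio.

    envelope_db, mask_db: per-band levels in dB (assumed unit), same length.
    Python's sort is stable, so bands with equal ratios keep index order.
    """
    smr = [e - m for e, m in zip(envelope_db, mask_db)]
    return sorted(range(len(smr)), key=lambda band: smr[band], reverse=True)


# Toy example with 4 bands instead of the 3x32 bands of the embodiment.
order = rank_bands([60.0, 55.0, 70.0, 40.0], [50.0, 52.0, 50.0, 45.0])
print(order)
```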

As a function of the desired number N of bits in the sequence Φ for the coding of the current frame, the bands which are to be quantized by the module 9 are determined by selecting the bands ranked first by the module 13 and by keeping for each band selected a number of bits such as is determined by the module 12.

Then the MDCT coefficients of each band selected are quantized by the module 9, for example with the aid of a vector quantizer, in accordance with the allocated number of bits, so as to produce a total number of bits equal to N−N0.

The output multiplexer 10 builds the binary sequence Φ consisting of the first N bits of the following ordered sequence represented in FIG. 2 (case N=Nmax):

-   a/ firstly the binary trains corresponding to the two G.723.1 frames (384 bits);
-   b/ next the bits F₂₂^(i), . . . , F₃₂^(i) for quantizing the scale factors, for the three subframes (i=1, 2, 3), from the 22nd spectral band (first band beyond the telephone band) to the 32nd band (variable rate Huffman coding);
-   c/ next the bits F₁^(i), . . . , F₂₁^(i) for quantizing the scale factors, for the three subframes (i=1, 2, 3), from the 1st spectral band to the 21st band (variable rate Huffman coding);
-   d/ and finally the indices M_c1, M_c2, . . . , M_c96 of vector quantization of the 96 bands in order of perceptual importance, from the most important band to the least important band, while complying with the order determined by the module 13.
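
The construction of the sequence from groups a to d, followed by truncation to N bits, may be illustrated as follows in Python. Bit fields are modelled as strings of '0' and '1' purely for readability; the function name `build_sequence` and its argument layout are assumptions of this sketch, and the toy field lengths bear no relation to the real ones.

```python
def build_sequence(kernel_bits, high_sf_bits, low_sf_bits, band_bits, order, n):
    """Concatenate groups a, b, c, d and keep the first n bits.

    kernel_bits:  group a, the coder kernel frames.
    high_sf_bits: group b, scale factors of the high bands.
    low_sf_bits:  group c, scale factors of the telephone band.
    band_bits:    dict mapping a band index to its vector quantization bits.
    order:        band indices from most to least important (group d).
    """
    group_d = "".join(band_bits[band] for band in order)
    full = kernel_bits + high_sf_bits + low_sf_bits + group_d
    return full[:n]


# Toy fields: 16 bits in total, truncated to N = 12.
sequence = build_sequence("1111", "00", "101",
                          {0: "01", 1: "1100", 2: "0"}, [1, 2, 0], 12)
print(sequence)
```

In this toy case the truncation cuts the most important band after three of its four bits, which is the partially transmitted parameter situation discussed at the end of the description.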

By placing first (a and b) the G.723.1 parameters and the scale factors of the high bands it is possible to retain the same bandwidth for the signal restorable by the decoder regardless of the actual bit rate beyond a minimum value corresponding to the reception of these groups a and b. This minimum value, sufficient for the Huffman coding of the 3×11=33 scale factors of the high bands in addition to the G.723.1 coding, is for example 8 kbit/s.

The method of coding hereinabove allows a decoding of the frame if the decoder receives N′ bits with N0≦N′≦N. This number N′ will generally be variable from one frame to another.

A decoder according to the invention, corresponding to this example, is illustrated by FIG. 3. A demultiplexer 20 separates the sequence of bits received Φ′ so as to extract therefrom the coding bits P1 and P2. The 384 bits P1 are supplied to the decoder kernel 21 of G.723.1 type so that the latter synthesizes two frames of the base signal S′ in the telephone band. The bits P2 are decoded according to the Huffman algorithm by a module 22 which thus recovers the quantized scale factors FQ for each of the 3 subframes.

A module 23 calculating the masking curve, identical to the module 11 of the coder of FIG. 1, receives the base signal S′ and the quantized scale factors FQ and produces the spectral masking levels for each of the 96 bands. On the basis of these masking levels, of the quantized scale factors FQ and of the knowledge of the number Nmax (as well as of that of the number N0 which is deduced from the Huffman decoding of the bits P2 by the module 22), a module 24 determines an allocation of bits in the same manner as the module 12 of FIG. 1. Furthermore, a module 25 proceeds to the ordering of the bands according to the same ranking criterion as the module 13 described with reference to FIG. 1.

According to the information supplied by the modules 24 and 25, the module 26 extracts the bits P3 of the input sequence Φ′ and synthesizes the normalized MDCT coefficients relating to the bands represented in the sequence Φ′. If appropriate (N′<Nmax), the standardized MDCT coefficients relating to the missing bands may furthermore be synthesized by interpolation or extrapolation as described hereinbelow (module 27). These missing bands may have been eliminated by the coder on account of a truncation to N<Nmax, or they may have been eliminated in the course of transmission (N′<N).

The standardized MDCT coefficients, synthesized by the module 26 and/or the module 27, are multiplied by their respective quantized scale factors (multiplier 28) before being presented to the module 29 which performs the frequency/time transformation which is the inverse of the MDCT transformation operated by the module 5 of the coder. The temporal correction signal which results therefrom is added to the synthetic signal S′ delivered by the decoder kernel 21 (adder 30) to produce the output audio signal Ŝ of the decoder.

It should be noted that the decoder will be able to synthesize a signal Ŝ even in cases where it does not receive the first N0 bits of the sequence.

It is sufficient for it to receive the 2×N1 bits corresponding to the part a of the listing hereinabove, the decoding then being in a “degraded” mode. Only this degraded mode does not use the MDCT synthesis to obtain the decoded signal. To ensure the switching with no break between this mode and the other modes, the decoder performs three MDCT analyses followed by three MDCT syntheses, allowing the updating of the memories of the MDCT transformation. The output signal contains a signal of telephone band quality. If the first 2×N1 bits are not even received, the decoder considers the corresponding frame as having been erased and can use a known algorithm for concealing erased frames.

If the decoder receives the 2×N1 bits corresponding to part a plus bits of part b (high bands of the three spectral envelopes), it can begin to synthesize a wide band signal. It can in particular proceed as follows.

-   1/ The module 22 recovers the parts of the three spectral envelopes received.
-   2/ The bands not received have their scale factors temporarily set to zero.
-   3/ The low parts of the spectral envelopes are calculated on the basis of the MDCT analyses performed on the signal obtained after the G.723.1 decoding, and the module 23 calculates the three masking curves on the envelopes thus obtained.
-   4/ The spectral envelope is corrected so as to regularize it by avoiding the nulls due to the bands not received; the zero values in the high part of the spectral envelopes FQ are for example replaced by a hundredth of the value of the masking curve calculated previously, so that they remain inaudible. The complete spectrum of the low bands and the spectral envelope of the high bands are known at this juncture.
-   5/ The module 27 then generates the high spectrum. The fine structure of these bands is generated by reflection of the fine structure of its known neighborhood before weighting by the scale factors (multipliers 28). In the case where none of the bits P3 is received, the “known neighborhood” corresponds to the spectrum of the signal S′ produced by the G.723.1 decoder kernel. Its “reflection” can consist in copying the value of the standardized MDCT spectrum, possibly with its variations being attenuated in proportion to the distance away from the “known neighborhood”.
-   6/ After inverse MDCT transformation (29) and addition (30) of the resulting correction signal to the output signal of the decoder kernel, the wide band synthesized signal is obtained.
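
Steps 4 and 5 above may be pictured with the following Python sketch. It is only one possible interpretation: the mirror-image copy used for the “reflection”, the geometric attenuation factor `decay` and both function names are assumptions of the sketch; only the replacement of missing scale factors by a hundredth of the masking value is taken from step 4.

```python
def fill_missing_scale_factors(scale_factors, mask, received):
    """Step 4: replace the scale factors of bands not received by mask / 100.

    scale_factors, mask: per-band values; received: per-band booleans.
    """
    return [sf if ok else m / 100.0
            for sf, m, ok in zip(scale_factors, mask, received)]


def reflect_fine_structure(known, count, decay=1.0):
    """Step 5 (one reading): mirror the end of the known normalized spectrum.

    known: normalized coefficients of the known neighborhood.
    count: number of coefficients to generate (at most len(known)).
    decay: hypothetical attenuation applied once more per coefficient of
           distance from the known neighborhood (1.0 means a plain copy).
    """
    mirrored = known[::-1][:count]
    return [c * decay ** k for k, c in enumerate(mirrored, start=1)]


print(fill_missing_scale_factors([4.0, 0.0, 0.0], [2.0, 3.0, 5.0],
                                 [True, False, False]))
print(reflect_fine_structure([1.0, 2.0, 3.0, 4.0], 3, decay=0.5))
```

The generated coefficients would then be weighted by the corrected scale factors before the inverse MDCT transformation of step 6.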

In the case where the decoder also receives part at least of the low spectral envelope of the difference signal (part c), it may or may not take this information into account to refine the spectral envelope in step 3.

If the decoder 10 receives enough bits P3 to decode at least the MDCT coefficients of the most important band, ranked first in the part d of the sequence, then the module 26 recovers certain of the normalized MDCT coefficients according to the allocation and ordering that are indicated by the modules 24 and 25. These MDCT coefficients therefore need not be interpolated as in step 5 hereinabove. For the other bands, the process of steps 1 to 6 is applicable by the module 27 in the same manner as previously, the knowledge of the MDCT coefficients received for certain bands allowing more reliable interpolation in step 5.
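A minimal sketch of this partition between decoded and interpolated bands is given below. It assumes, for simplicity, that the ordering is expressed band by band and that a band is decoded only if all of its allocated bits have been received; the names `split_bands`, `ordered_bands`, `bits_per_band` and `bits_available` are hypothetical and do not come from the description.

```python
def split_bands(ordered_bands, bits_per_band, bits_available):
    """Separate bands whose coefficients can be decoded from those to interpolate.

    ordered_bands : band indices ranked by decreasing importance.
    bits_per_band : mapping band index -> number of bits allocated to it.
    bits_available: number of bits actually received for these coefficients.
    """
    decoded, to_interpolate = [], []
    remaining = bits_available
    truncated = False  # becomes True once the received bits run out
    for band in ordered_bands:
        need = bits_per_band[band]
        if not truncated and 0 < need <= remaining:
            decoded.append(band)
            remaining -= need
        else:
            if need > remaining:
                # The sequence is cut here: later bands were not received.
                truncated = True
            to_interpolate.append(band)
    return decoded, to_interpolate
```

For example, with the ranking `[3, 0, 2, 1]`, allocations of 12, 10, 0 and 8 bits respectively, and 25 bits received, bands 3 and 0 are decoded while bands 2 and 1 are left to the interpolation of step 5.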

The bands not received may vary from one MDCT subframe to the next. The “known neighborhood” of a missing band may correspond to the same band in another subframe where it is not missing, and/or to one or more bands closest in the frequency domain in the course of the same subframe. It is also possible to regenerate an MDCT spectrum missing from a band for a subframe by calculating a weighted sum of contributions evaluated on the basis of several bands/subframes of the “known neighborhood”.
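The weighted sum mentioned above could, for instance, take the following form. The function name `regenerate_band`, the representation of each contribution as a `(weight, coefficients)` pair and the normalization by the sum of the weights are assumptions of this sketch; the description does not specify how the weights are chosen.

```python
def regenerate_band(contributions):
    """Estimate the missing MDCT coefficients of one band.

    contributions: list of (weight, coefficients) pairs, one per element of
    the known neighborhood (same band in another subframe, or a nearby band
    of the same subframe). All coefficient lists have the same length.
    """
    if not contributions:
        raise ValueError("at least one contribution is required")
    total = sum(w for w, _ in contributions)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    n = len(contributions[0][1])
    return [sum(w * c[k] for w, c in contributions) / total
            for k in range(n)]
```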

Insofar as the actual bit rate of N′ bits per frame places the last bit of a given frame arbitrarily, the last coded parameter transmitted may, according to case, be transmitted completely or partially. Two cases may then arise:

-   either the coding structure adopted makes it possible to utilize the partial information received (case of scalar quantizers, or of vector quantization with partitioned dictionaries),
-   or it does not allow it and the parameter not fully received is processed like the other parameters not received. It is noted that, for this latter case, if the order of the bits varies with each frame, the number of bits thus lost is variable and the selection of N′ bits will produce on average, over the whole set of frames decoded, a better quality than that which would be obtained with a smaller number of bits.
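As an illustration of the first case, the sketch below shows one conceivable way of using a partially received index of a uniform scalar quantizer. It assumes that the index is transmitted most significant bit first in natural binary and that reconstruction uses the midpoint of the interval still compatible with the bits received; neither assumption is stated in the description, and the name `decode_partial_uniform` is hypothetical.

```python
def decode_partial_uniform(bits, total_bits, lo, hi):
    """Reconstruct a value from the first bits of a uniform quantizer index.

    bits      : received bits (0/1), most significant first.
    total_bits: number of bits of the complete index.
    lo, hi    : bounds of the quantizer range [lo, hi).
    Returns None when no bit was received, in which case the parameter is
    handled like any other parameter not received.
    """
    if len(bits) > total_bits:
        raise ValueError("more bits than the index contains")
    if not bits:
        return None
    prefix = 0
    for b in bits:
        prefix = (prefix << 1) | b
    # Each received bit halves the interval in which the value can lie.
    width = (hi - lo) / (1 << len(bits))
    return lo + (prefix + 0.5) * width
```

With a 4-bit index over the range 0 to 16, receiving only the bits `1, 0` narrows the value to the interval from 8 to 12 and the sketch returns its midpoint.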

1. A method of coding a digital audio signal frame as a binary output sequence, in which a maximum number Nmax of coding bits is defined for a set of parameters that can be calculated according to the signal frame, which set is composed of a first and of a second subset, the method comprising the following steps: calculating the parameters of the first subset, and coding these parameters on a number N0 of coding bits such that N0<Nmax; determining an allocation of Nmax−N0 coding bits for the parameters of the second subset; and ranking the Nmax−N0 coding bits allocated to the parameters of the second subset in a determined order, in which the allocation and/or the order of ranking of the Nmax−N0 coding bits is determined as a function of the coded parameters of the first subset, the method furthermore comprising the following steps in response to the indication of a number N of bits of the binary output sequence that are available for the coding of said set of parameters, with N0<N≦Nmax: selecting the second subset's parameters to which are allocated the N−N0 coding bits ranked first in said order; calculating the selected parameters of the second subset, and coding these parameters so as to produce said N−N0 coding bits ranked first; and inserting into the output sequence the N0 coding bits of the first subset as well as the N−N0 coding bits of the selected parameters of the second subset.
2. The method as claimed in claim 1, in which the order of ranking of the coding bits allocated to the parameters of the second subset is variable from one frame to another.
3. The method as claimed in claim 1, in which N<Nmax.
4. The method as claimed in claim 1, in which the order of ranking of the coding bits allocated to the parameters of the second subset is an order of decreasing importance determined as a function of at least the coded parameters of the first subset.
5. The method as claimed in claim 4, in which the order of ranking of the coding bits allocated to the parameters of the second subset is determined with the aid of at least one psychoacoustic criterion as a function of the coded parameters of the first subset.
6. The method as claimed in claim 5, in which the parameters of the second subset pertain to spectral bands of the signal, in which a spectral envelope of the coded signal is estimated on the basis of the coded parameters of the first subset, in which a curve of frequency masking is calculated by applying an auditory perception model to the estimated spectral envelope, and in which the psychoacoustic criterion makes reference to the level of the estimated spectral envelope with respect to the masking curve in each spectral band.
7. The method as claimed in claim 4, in which Nmax=N.
8. The method as claimed in claim 1, in which the coding bits are ordered in the output sequence in such a way that the N0 coding bits of the first subset precede the N−N0 coding bits of the selected parameters of the second subset and that the respective coding bits of the selected parameters of the second subset appear therein in the order determined for said coding bits.
9. The method as claimed in claim 1, in which the number N varies from one frame to another.
10. The method as claimed in claim 1, in which the coding of the parameters of the first subset is at variable bit rate, thereby varying the number N0 from one frame to another.
11. The method as claimed in claim 1, in which the first subset comprises parameters calculated by a coder kernel.
12. The method as claimed in claim 11, in which the coder kernel has a lower frequency band of operation than the bandwidth of the signal to be coded, and in which the first subset furthermore comprises energy levels of the audio signal that are associated with frequency bands higher than the operating band of the coder kernel.
13. The method as claimed in claim 8, in which the coding bits of the first subset are ordered in the output sequence in such a way that the coding bits of the parameters calculated by the coder kernel are immediately followed by the coding bits of the energy levels associated with the higher frequency bands.
14. The method as claimed in claim 11, in which a signal of difference between the signal to be coded and a synthesis signal derived from the coded parameters produced by the coder kernel is estimated, and in which the first subset furthermore comprises energy levels of the difference signal that are associated with frequency bands included in the operating band of the coder kernel.
15. The method as claimed in claim 8 and claim 12, in which the coding bits of the first subset are ordered in the output sequence in such a way that the coding bits of the parameters calculated by the coder kernel are followed by the coding bits of the energy levels associated with the frequency band.
16. A method of decoding a binary input sequence so as to synthesize a digital audio signal, in which a maximum number Nmax of coding bits is defined for a set of parameters for describing a signal frame, which set is composed of a first and a second subset, the input sequence comprising, for a signal frame, a number N′ of coding bits for said set of parameters, with N′≦Nmax, the method comprising the following steps: extracting, from said N′ bits of the input sequence, a number N0 of coding bits of the parameters of the first subset if N0<N′; recovering the parameters of the first subset on the basis of said N0 coding bits extracted; determining an allocation of Nmax−N0 coding bits for the parameters of the second subset; and ranking the Nmax−N0 coding bits allocated to the parameters of the second subset in a determined order, in which the allocation and/or the order of ranking of the Nmax−N0 coding bits is determined as a function of the recovered parameters of the first subset, the method furthermore comprising the following steps: selecting the second subset's parameters to which are allocated the N′−N0 coding bits ranked first in said order; extracting, from said N′ bits of the input sequence, N′−N0 coding bits of the selected parameters of the second subset; recovering the selected parameters of the second subset on the basis of said N′−N0 coding bits extracted; and synthesizing the signal frame by using the recovered parameters of the first and second subsets.
17. The method as claimed in claim 16, in which the order of ranking of the coding bits allocated to the parameters of the second subset is variable from one frame to another.
18. The method as claimed in claim 16, in which N′<Nmax.
19. The method as claimed in claim 16, in which the order of ranking of the coding bits allocated to the parameters of the second subset is an order of decreasing importance determined as a function of at least the recovered parameters of the first subset.
20. The method as claimed in claim 19, in which the order of ranking of the coding bits allocated to the parameters of the second subset is determined with the aid of at least one psychoacoustic criterion as a function of the recovered parameters of the first subset.
21. The method as claimed in claim 20, in which the parameters of the second subset pertain to spectral bands of the signal, in which a spectral envelope of the signal is estimated on the basis of the recovered parameters of the first subset, in which a curve of frequency masking is calculated by applying an auditory perception model to the estimated spectral envelope, and in which the psychoacoustic criterion makes reference to the level of the estimated spectral envelope with respect to the masking curve in each spectral band.
22. The method as claimed in claim 16, in which the N0 coding bits of the parameters of the first subset are extracted from the N′ bits received at positions of the sequence which precede the positions from which are extracted the N′−N0 coding bits of the selected parameters of the second subset.
23. The method as claimed in claim 16, in which, to synthesize the signal frame, nonselected parameters of the second subset are estimated by interpolation on the basis of at least selected parameters recovered on the basis of said N′−N0 coding bits extracted.
24. The method as claimed in claim 16, in which the first subset comprises input parameters of a decoder kernel.
25. The method as claimed in claim 24, in which the decoder kernel has a lower frequency band of operation than the bandwidth of the signal to be synthesized, and in which the first subset furthermore comprises energy levels of the audio signal that are associated with frequency bands higher than the operating band of the decoder kernel.
26. The method as claimed in claim 22, in which the coding bits of the first subset in the input sequence are ordered in such a way that the coding bits of the input parameters of the decoder kernel are immediately followed by the coding bits of the energy levels associated with the higher frequency bands.
27. The method as claimed in claim 26, comprising the following steps if the N′ bits of the input sequence are limited to the coding bits of the input parameters of the decoder kernel and to part at least of the coding bits of the energy levels associated with the higher frequency bands: extracting from the input sequence the coding bits of the input parameters of the decoder kernel and said part of the coding bits of the energy levels; synthesizing a base signal in the decoder kernel and recovering energy levels associated with the higher frequency bands on the basis of said extracted coding bits; calculating a spectrum of the base signal; assigning an energy level to each higher band with which is associated an uncoded energy level in the input sequence; synthesizing spectral components for each higher frequency band on the basis of the corresponding energy level and of the spectrum of the base signal in at least one band of said spectrum; applying a transformation into the time domain to the synthesized spectral components so as to obtain a base signal correction signal; and adding together the base signal and the correction signal so as to synthesize the signal frame.
28. The method as claimed in claim 27, in which the energy level assigned to a higher band with which is associated an uncoded energy level in the input sequence is a fraction of a perceptual masking level calculated in accordance with the spectrum of the base signal and the energy levels recovered on the basis of the extracted coding bits.
29. The method as claimed in claim 24, in which a base signal is synthesized in the decoder kernel, and in which the first subset furthermore comprises energy levels of a signal of difference between the signal to be synthesized and the base signal that are associated with frequency bands included in the operating band of the coder kernel.
30. The method as claimed in claim 25, in which, for N0<N′<Nmax, unselected parameters of the second subset that pertain to spectral components in frequency bands are estimated with the aid of a calculated spectrum of the base signal and/or selected parameters recovered on the basis of said N′−N0 coding bits extracted.
31. The method as claimed in claim 30, in which the unselected parameters of the second subset in a frequency band are estimated with the aid of a spectral neighborhood of said band, which neighborhood is determined on the basis of the N′ coding bits of the input sequence.
32. The method as claimed in claim 22 and claim 25, in which the coding bits of the input parameters of the decoder kernel are extracted from the N′ bits received at positions of the sequence which precede the positions from which are extracted the coding bits of the energy levels associated with the frequency bands.
33. The method as claimed in claim 16, in which the number N′ varies from one frame to another.
34. The method as claimed in claim 16, in which the number N0 varies from one frame to another.
35. An audio coder, comprising means of digital signal processing that are devised to implement a method of coding a digital audio signal frame as a binary output sequence, in which a maximum number Nmax of coding bits is defined for a set of parameters that can be calculated according to the signal frame, which set is composed of a first and of a second subset, the method comprising the following steps: calculating the parameters of the first subset, and coding these parameters on a number N0 of coding bits such that N0<Nmax; determining an allocation of Nmax−N0 coding bits for the parameters of the second subset; and ranking the Nmax−N0 coding bits allocated to the parameters of the second subset in a determined order, in which the allocation and/or the order of ranking of the Nmax−N0 coding bits is determined as a function of the coded parameters of the first subset, the method furthermore comprising the following steps in response to the indication of a number N of bits of the binary output sequence that are available for the coding of said set of parameters, with N0<N≦Nmax: selecting the second subset's parameters to which are allocated the N−N0 coding bits ranked first in said order; calculating the selected parameters of the second subset, and coding these parameters so as to produce said N−N0 coding bits ranked first; and inserting into the output sequence the N0 coding bits of the first subset as well as the N−N0 coding bits of the selected parameters of the second subset.
36. An audio decoder, comprising means of digital signal processing that are devised to implement a method of decoding a binary input sequence so as to synthesize a digital audio signal, in which a maximum number Nmax of coding bits is defined for a set of parameters for describing a signal frame, which set is composed of a first and a second subset, the input sequence comprising, for a signal frame, a number N′ of coding bits for said set of parameters, with N′≦Nmax, the method comprising the following steps: extracting, from said N′ bits of the input sequence, a number N0 of coding bits of the parameters of the first subset if N0<N′; recovering the parameters of the first subset on the basis of said N0 coding bits extracted; determining an allocation of Nmax−N0 coding bits for the parameters of the second subset; and ranking the Nmax−N0 coding bits allocated to the parameters of the second subset in a determined order, in which the allocation and/or the order of ranking of the Nmax−N0 coding bits is determined as a function of the recovered parameters of the first subset, the method furthermore comprising the following steps: selecting the second subset's parameters to which are allocated the N′−N0 coding bits ranked first in said order; extracting, from said N′ bits of the input sequence, N′−N0 coding bits of the selected parameters of the second subset; recovering the selected parameters of the second subset on the basis of said N′−N0 coding bits extracted; and synthesizing the signal frame by using the recovered parameters of the first and second subsets.